
    A GPU-accelerated Direct-sum Boundary Integral Poisson-Boltzmann Solver

    Full text link
    In this paper, we present a GPU-accelerated direct-sum boundary integral method to solve the linear Poisson-Boltzmann (PB) equation. In our method, a well-posed boundary integral formulation is used to ensure the fast convergence of Krylov subspace based linear algebraic solvers such as GMRES. The molecular surfaces are discretized with flat triangles and centroid collocation. To speed up our method, we take advantage of the parallel nature of the boundary integral formulation and parallelize the schemes within the CUDA shared memory architecture on the GPU. The schemes use only 11N + 6N_c doubles of device memory for a biomolecule with N triangular surface elements and N_c partial charges. Numerical tests of these schemes show well-maintained accuracy and fast convergence. The GPU implementation using one GPU card (Nvidia Tesla M2070) achieves a 120-150X speed-up over the implementation using one CPU (Intel L5640, 2.27 GHz). With our approach, solving PB equations on well-discretized molecular surfaces with up to 300,000 boundary elements takes less than about 10 minutes; hence our approach is particularly suitable for fast electrostatics computations on small to medium biomolecules.
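
    The dominant cost in such a method is the O(N^2) direct sum evaluated inside each GMRES matrix-vector product. The sketch below is a minimal CPU reference for that kind of matrix-free matvec over centroid-collocated elements; the Element layout, the single-layer-only screened-Coulomb kernel, and the OpenMP loop are illustrative assumptions, not the paper's well-posed formulation or its CUDA implementation.

    #include <math.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif

    typedef struct { double x, y, z, area; } Element;  /* triangle centroid and area (assumed layout) */

    /* y = A*q for a simplified single-layer operator; kappa is the inverse Debye length. */
    void direct_sum_matvec(const Element *el, const double *q, double *y,
                           int n, double kappa)
    {
        #pragma omp parallel for            /* a GPU version would map this outer loop to CUDA threads */
        for (int i = 0; i < n; ++i) {
            double sum = 0.0;
            for (int j = 0; j < n; ++j) {
                if (j == i) continue;       /* singular self-term handled separately */
                double dx = el[i].x - el[j].x;
                double dy = el[i].y - el[j].y;
                double dz = el[i].z - el[j].z;
                double r  = sqrt(dx*dx + dy*dy + dz*dz);
                sum += exp(-kappa * r) / (4.0 * M_PI * r) * el[j].area * q[j];
            }
            y[i] = sum;
        }
    }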

    Representing Clones in a Localized Manner

    No full text
    Code clones (i.e., duplicate sections of code) can be scattered throughout the source files of a program. Manually evaluating a group of such clones requires observing each clone in its original location (i.e., opening each file and finding the source location of each clone), which can be a time-consuming process. As an alternative, this paper introduces a technique that localizes the representation of code clones to provide a summary of the properties of two or more clones in one location. In our approach, the results of a clone detection tool are analyzed in an automated manner to determine the properties (i.e., similarities and differences) of the clones. These properties are visualized directly within the source editor. The localized representation is realized as part of the features of an Eclipse plug-in called CeDAR.
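
    As a rough illustration of what "determining similarities and differences" between clones can mean, the sketch below scores two token streams by their longest-common-subsequence overlap. It is a generic example only; the abstract does not describe CeDAR's actual analysis at this level of detail.

    #include <stdio.h>
    #include <string.h>

    #define MAXTOK 128

    /* Longest-common-subsequence length over two token streams (each at most MAXTOK tokens). */
    static size_t lcs_len(char *a[], size_t na, char *b[], size_t nb)
    {
        static size_t dp[MAXTOK + 1][MAXTOK + 1];   /* row 0 and column 0 stay zero */
        for (size_t i = 1; i <= na; ++i)
            for (size_t j = 1; j <= nb; ++j)
                dp[i][j] = (strcmp(a[i - 1], b[j - 1]) == 0)
                         ? dp[i - 1][j - 1] + 1
                         : (dp[i - 1][j] > dp[i][j - 1] ? dp[i - 1][j] : dp[i][j - 1]);
        return dp[na][nb];
    }

    int main(void)
    {
        /* token streams of two hypothetical clones that differ only in identifiers */
        char *c1[] = { "for", "(", "i", "=", "0", ";", "i", "<", "n", ";", "i", "++", ")" };
        char *c2[] = { "for", "(", "j", "=", "0", ";", "j", "<", "m", ";", "j", "++", ")" };
        size_t n1 = sizeof c1 / sizeof *c1, n2 = sizeof c2 / sizeof *c2;
        size_t common = lcs_len(c1, n1, c2, n2);

        printf("shared tokens: %zu of %zu/%zu (similarity %.0f%%)\n",
               common, n1, n2, 200.0 * (double)common / (double)(n1 + n2));
        return 0;
    }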

    A Platform-Independent Tool for Modeling Parallel Programs

    No full text
    Programming languages that can utilize the underlying parallel architecture, whether shared memory, distributed memory, or Graphics Processing Units (GPUs), are used extensively for solving scientific problems. However, from our study of multiple parallel programs from various domains, we observe that programs in such languages contain a substantial amount of sequential code mixed with the parallel code. When rewriting the parallel code for another platform, the same sequential code is often reused without much modification. Although this is a common occurrence, existing tools and programming environments do not offer much support for this process. In this paper, we introduce a tool named PPmodel, which was designed and implemented to assist programmers in separating the core computation from the details of a specific parallel architecture. Using PPmodel, a programmer can identify and retarget the parallel section of a program to execute on a different platform. With PPmodel, a programmer is better enabled to focus on the parallel section of interest, while ignoring other parallel and sequential sections in the program. The tool is illustrated by retargeting the parallel section of an OpenMP program for the circuit satisfiability problem to run on a cluster using the Message Passing Interface (MPI).
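
    The sketch below shows the kind of retargeting PPmodel is meant to automate: the same exhaustive search over 2^N input assignments of a circuit-satisfiability predicate, first as an OpenMP parallel section and then as an MPI counterpart. The circuit() predicate and the circuit size are stand-ins, not the example program used in the paper.

    #include <stdio.h>
    #include <mpi.h>

    #define N 23                            /* number of circuit inputs (illustrative) */

    /* Stand-in circuit: satisfied when the low three bits are 1, 0, 1. */
    static int circuit(unsigned bits)
    {
        return (bits & 1u) && !(bits & 2u) && (bits & 4u);
    }

    /* OpenMP version: one process, the 2^N assignments split across threads. */
    static long count_openmp(void)
    {
        long hits = 0;
        #pragma omp parallel for reduction(+:hits)
        for (long i = 0; i < (1L << N); ++i)
            hits += circuit((unsigned)i);
        return hits;
    }

    /* MPI retargeting: the same loop body, with the iteration space partitioned
     * cyclically across ranks and a final reduction onto rank 0. */
    static long count_mpi(void)
    {
        int rank, size;
        long local = 0, total = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        for (long i = rank; i < (1L << N); i += size)
            local += circuit((unsigned)i);
        MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
        return total;                       /* meaningful on rank 0 only */
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        long total = count_mpi();
        if (rank == 0)
            printf("satisfying assignments: %ld (OpenMP check on rank 0: %ld)\n",
                   total, count_openmp());
        MPI_Finalize();
        return 0;
    }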

    Modeling of high performance programs to support heterogeneous computing

    No full text
    In order to harness the power of multicore CPUs and GPUs, HPC (High Performance Computing) programmers and even end-users need new tools and techniques to express their core problem, divide that core problem into sub-problems, allocate computational resources to the sub-problems, execute them, and collect the results. HPC users focus more on the problem domain, while HPC programmers are concerned with the code or HPC domain. However, in current practice, the distinction between programmers and users is not clearly delineated, because most end-users (e.g., scientists who have a computational need) must create and write their own HPC code. Moreover, HPC users also have to maintain the HPC source code to keep abreast of the latest advances, techniques, and platforms introduced by the HPC programming community. The specific aim of this dissertation is to introduce new software engineering ideas (e.g., Model-Driven Engineering (MDE) and Domain-Specific Languages (DSLs)) and supporting tools to assist in the evolution of parallel programs used by HPC programmers, as well as HPC users. In this dissertation, we show that tool support can be provided for HPC programs at different levels of abstraction targeted for a specific set of users. These levels of abstraction are: 1) Code-level, 2) Algorithm-level, 3) Program-level, and 4) Sub-domain-level. We designed, implemented, and evaluated DSLs at each abstraction level to support heterogeneous computing. Code-level abstraction is very general and can be applied to any C/C++ program, while algorithm-level abstraction is only applicable to programs implementing MapReduce algorithms. Compared to code-level and algorithm-level abstraction, program-level and sub-domain-level abstractions are very specific and are only applicable to particular domains and users (e.g., Signature Discovery Initiative (SDI) project participants and N-body solution users). We observed that if the domain is specific, less information is required from the user because the DSLs are domain-aware. If the domain is very general (e.g., in the code-level and algorithm-level abstractions), the DSL has more areas of application, but adopting the DSL at these more general levels requires additional information from the end-users. (Published by University of Alabama Libraries.)
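
    To make the point about pattern-aware abstractions concrete, the generic sketch below shows a MapReduce-style skeleton in which only the map and reduce callbacks vary; the rest could be generated, and a platform-specific backend could parallelize the loop (OpenMP, MPI, CUDA). This is an illustration of the general idea only, not the dissertation's actual DSLs or generated code.

    #include <stdio.h>

    typedef double (*map_fn)(double);
    typedef double (*reduce_fn)(double, double);

    /* Fixed skeleton that a pattern-aware generator can own; a backend would
     * replace this sequential loop with a platform-specific parallel version. */
    static double map_reduce(const double *in, int n, map_fn m, reduce_fn r, double init)
    {
        double acc = init;
        for (int i = 0; i < n; ++i)
            acc = r(acc, m(in[i]));
        return acc;
    }

    /* The only pieces the user must supply. */
    static double square(double x)        { return x * x; }
    static double add(double a, double b) { return a + b; }

    int main(void)
    {
        double v[] = { 1.0, 2.0, 3.0, 4.0 };
        printf("sum of squares = %g\n", map_reduce(v, 4, square, add, 0.0));
        return 0;
    }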